Abstract. A wide literature is available on the asymptotic behavior of the
Durbin-Watson statistic for autoregressive models. However, no results seem to
be available on the Durbin-Watson statistic for autoregressive models with
adaptive control. Our purpose is to fill this gap by establishing the asymptotic
behavior of the Durbin-Watson statistic for ARX models in adaptive tracking. On
the one hand, we show the almost sure convergence as well as the asymptotic
normality of the least squares estimators of the unknown parameters of the ARX
models. On the other hand, we establish the almost sure convergence of the
Durbin-Watson statistic and its asymptotic normality. Finally, we propose a
bilateral statistical test for residual autocorrelation in adaptive tracking.



1. Introduction and Motivation 

The Durbin-Watson statistic was introduced in the pioneering works of Durbin and
Watson [6], [7], [8], in order to detect the presence of a first-order autocorrelated
driven noise in linear regression models. A wide literature is available on the
asymptotic behavior of the Durbin-Watson statistic for linear regression models and it
is well-known that the statistical test based on the Durbin-Watson statistic performs
well when the regressors are independent random variables. However, as soon
as the regressors are lagged dependent variables, which is of course the most
attractive case, its widespread use in inappropriate situations may lead to wrong
conclusions. More precisely, it was observed by Malinvaud [12] and Nerlove and
Wallis [12] that the Durbin-Watson statistic may be asymptotically biased if the model
itself and the driven noise are governed by first-order autoregressive processes. In
order to prevent this misuse, Durbin [5] proposed a redesigned alternative test in the
particular case of the first-order autoregressive process previously investigated in
[15], [16]. More recently, Stocker [20] provided substantial improvements in the study
of the asymptotic behavior of the Durbin-Watson statistic resulting from the presence
of a first-order autocorrelated noise. We also refer the reader to Bercu and Proia
[2] for a recent sharp analysis on the asymptotic behavior of the Durbin-Watson
statistic via a martingale approach.

Moreover, as far as the authors know, there are no established results on the
Durbin-Watson statistic for autoregressive models with exogenous control. Therefore,
our purpose is to investigate the asymptotic behavior of the Durbin-Watson statistic
for the ARX(p, q) processes where $p \geq 1$ and $q \geq 0$. We focus our attention on
the ARX(p, 0) process, given for all $n \geq 0$, by

(1.1)    $X_{n+1} = \sum_{k=1}^{p} \theta_k X_{n-k+1} + U_n + \varepsilon_{n+1}$

in which the driven noise $(\varepsilon_n)$ follows the first-order autoregressive process

(1.2)    $\varepsilon_{n+1} = \rho\,\varepsilon_n + V_{n+1}.$

We assume that the serial autocorrelation parameter satisfies $|\rho| < 1$ and the initial
values $X_0$, $\varepsilon_0$ and $U_0$ may be arbitrarily chosen. In all the sequel, we also
assume that $(V_n)$ is a martingale difference sequence adapted to the filtration
$\mathbb{F} = (\mathcal{F}_n)$ where $\mathcal{F}_n$ stands for the $\sigma$-algebra of the events
occurring up to time $n$. Moreover, we suppose that, for all $n \geq 0$,
$\mathbb{E}[V_{n+1}^2 \mid \mathcal{F}_n] = \sigma^2$ a.s. with $\sigma^2 > 0$. Denote by $\theta$
the unknown parameter of equation (1.1),

$\theta^t = (\theta_1, \theta_2, \ldots, \theta_p).$
Our goal is to deal simultaneously with three objectives. The first one is to
propose an efficient procedure in order to estimate the unknown parameters $\theta$ and
$\rho$ of the ARX(p, 0) process given by (1.1) and (1.2). The second one is to regulate
the dynamic of the process $(X_n)$ by forcing $X_n$ to track step by step a predictable
reference trajectory $(x_n)$. This second objective can be achieved by use of an
appropriate version of the adaptive tracking control proposed by Astrom and
Wittenmark [1]. Finally, our last objective is to establish the asymptotic properties
of the Durbin-Watson statistic in order to propose a bilateral test on the serial
parameter $\rho$.

The paper is organized as follows. Section 2 is devoted to the parameter estimation
procedure and the suitable choice of stochastic adaptive control. In Section 3, we
establish the almost sure convergence of the least squares estimators of $\theta$ and
$\rho$. The asymptotic normality of our estimates is given in Section 4. In Section 5,
we prove the almost sure convergence of the Durbin-Watson statistic as well as its
asymptotic normality, which leads us to propose a bilateral statistical test for
residual autocorrelation. Some numerical simulations are provided in Section 6.
Finally, all technical proofs are postponed to the appendices.

2. Estimation and Adaptive Control

Relation (1.1) can be rewritten as

(2.1)    $X_{n+1} = \theta^t \varphi_n + U_n + \varepsilon_{n+1}$

where

$\varphi_n = (X_n, X_{n-1}, \ldots, X_{n-p+1})^t.$

A naive strategy to regulate the dynamic of the process $(X_n)$ is to make use of the
Astrom-Wittenmark [1] adaptive tracking control

$U_n = x_{n+1} - \hat\theta_n^t \varphi_n$

where $\hat\theta_n$ stands for the least squares estimator of $\theta$. Unfortunately, we can
show that this strategy leads to biased estimation of the parameters $\theta$ and $\rho$.
This is due to the fact that $(\varepsilon_n)$ is not a white noise but the first-order
autoregressive process given by (1.2). Consequently, it is necessary to adopt a more
appropriate strategy, which means a more suitable choice for the adaptive control
$U_n$ in (2.1).
The construction of our control law is as follows. Starting from (1.1) together with
(1.2), we easily deduce that the process $(X_n)$ satisfies the fundamental ARX(p + 1, 1)
equation given, for all $n \geq 1$, by

(2.2)    $X_{n+1} = (\theta_1 + \rho)X_n + (\theta_2 - \rho\theta_1)X_{n-1} + \cdots + (\theta_p - \rho\theta_{p-1})X_{n-p+1} - \rho\theta_p X_{n-p} + U_n - \rho U_{n-1} + V_{n+1}$

which can be rewritten as

(2.3)    $X_{n+1} = \vartheta^t \Phi_n + U_n + V_{n+1}$

where the new parameter $\vartheta \in \mathbb{R}^{p+2}$ is defined as

(2.4)    $\vartheta = (\theta_1 + \rho,\ \theta_2 - \rho\theta_1,\ \ldots,\ \theta_p - \rho\theta_{p-1},\ -\rho\theta_p,\ -\rho)^t$

and the new regression vector $\Phi_n$ is given by

$\Phi_n = (X_n, X_{n-1}, \ldots, X_{n-p}, U_{n-1})^t.$

The original idea of this paper is to control the model (2.1) using the adaptive control
associated with the model (2.3) in order to a posteriori estimate the parameters $\theta$
and $\rho$ via the estimator of the parameter $\vartheta$. We shall now focus our attention on
the estimation of the unknown parameter $\vartheta$. We propose to make use of the least
squares estimator which satisfies, for all $n \geq 0$,

(2.5)    $\hat\vartheta_{n+1} = \hat\vartheta_n + S_n^{-1}\Phi_n\,(X_{n+1} - U_n - \hat\vartheta_n^t \Phi_n)$

where the initial value $\hat\vartheta_0$ may be arbitrarily chosen and

$S_n = \sum_{k=0}^{n} \Phi_k \Phi_k^t + I_{p+2}$

where the identity matrix $I_{p+2}$ is added in order to avoid a useless invertibility
assumption. On the other hand, we are concerned with the crucial choice of the
adaptive control $U_n$. The role played by $U_n$ is to regulate the dynamic of the process
$(X_n)$ by forcing $X_n$ to track step by step a predictable reference trajectory $(x_n)$. In
order to control the dynamic of $(X_n)$ given by (1.1), we propose to make use of the
Astrom-Wittenmark [1] adaptive tracking control associated with (2.3) and given,
for all $n \geq 0$, by

(2.6)    $U_n = x_{n+1} - \hat\vartheta_n^t \Phi_n.$

This suitable choice of $U_n$ will allow us to control the dynamic of the process (2.1)
while maintaining the optimality of the tracking and then estimate without bias the
parameters $\theta$ and $\rho$. In all the sequel, we assume that the reference trajectory
$(x_n)$ satisfies

(2.7)    $\sum_{k=1}^{n} x_k^2 = o(n)$ a.s.
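
The closed loop formed by (1.1), (1.2), (2.5) and (2.6) can be sketched numerically. The following Python snippet is only an illustration under stated assumptions, not the authors' code: the function name `simulate_adaptive_tracking` is hypothetical, the reference trajectory is taken identically zero, $(V_n)$ is a Gaussian N(0, 1) white noise, and all initial values (including $\hat\vartheta_0$) are set to zero.

```python
import numpy as np


def simulate_adaptive_tracking(theta, rho, n, seed=0):
    """Simulate (1.1)-(1.2) under the control (2.6) with x_n = 0 and V_n ~ N(0, 1).

    Returns the final value of the recursive least squares estimator (2.5),
    which estimates the parameter vartheta of (2.4).
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float)
    p = theta.size
    X = np.zeros(n + 1)
    U = np.zeros(n + 1)
    eps = 0.0                               # eps_0 = 0 (assumed initial value)
    S = np.eye(p + 2)                       # identity term of S_n
    vt = np.zeros(p + 2)                    # hat(vartheta)_0 = 0 (assumed)

    def at(a, k):                           # values before time 0 are set to 0
        return a[k] if k >= 0 else 0.0

    for k in range(n):
        # Phi_k = (X_k, ..., X_{k-p}, U_{k-1})^t
        Phi = np.array([at(X, k - j) for j in range(p + 1)] + [at(U, k - 1)])
        S += np.outer(Phi, Phi)             # S_k = sum_{j<=k} Phi_j Phi_j^t + I
        U[k] = -vt @ Phi                    # control (2.6) with x_{k+1} = 0
        eps = rho * eps + rng.standard_normal()           # noise (1.2)
        X[k + 1] = theta @ Phi[:p] + U[k] + eps           # process (1.1)
        innovation = X[k + 1] - U[k] - vt @ Phi
        vt = vt + np.linalg.solve(S, Phi) * innovation    # update (2.5)
    return vt
```

For instance, with $\theta = 8/5$ and $\rho = -4/5$, the returned vector should be close to $\vartheta = (\theta + \rho, -\rho\theta, -\rho) = (0.8, 1.28, 0.8)$ for large $n$.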

3. Almost sure convergence

All our asymptotic analysis relies on the following keystone lemma. First of all,
let $L$ be the identity matrix of order $p + 1$ and denote by $H$ the positive real number

(3.1)    $H = \sum_{k=1}^{p} (\theta_k + \rho^k)^2 + \frac{\rho^{2(p+1)}}{1 - \rho^2}.$

In addition, for $1 \leq k \leq p$, let $K_k = -(\theta_k + \rho^k)$ and denote by $K$ the line vector

(3.2)    $K = (0, K_1, K_2, \ldots, K_p).$

Moreover, let $\Lambda$ be the symmetric square matrix of order $p + 2$,

(3.3)    $\Lambda = \begin{pmatrix} L & K^t \\ K & H \end{pmatrix}.$



Lemma 3.1. Assume that $(V_n)$ has a finite conditional moment of order > 2. Then,
we have

(3.4)    $\lim_{n \to \infty} \frac{1}{n} S_n = \sigma^2 \Lambda$ a.s.

where the limiting matrix $\Lambda$ is given by (3.3). In addition, as soon as the correlation
parameter $\rho \neq 0$, the matrix $\Lambda$ is invertible and

(3.5)    $\Lambda^{-1} = \frac{1 - \rho^2}{\rho^{2(p+1)}} \begin{pmatrix} SL + K^t K & -K^t \\ -K & 1 \end{pmatrix}$

where $S = H - \|K\|^2$ is the Schur complement of $L$ in $\Lambda$,

(3.6)    $S = \frac{\rho^{2(p+1)}}{1 - \rho^2}.$

Proof. The proof is given in Appendix A. □

Remark 3.1. As $L$ is the identity matrix of order $p + 1$, we clearly have

$\det(\Lambda) = S = \frac{\rho^{2(p+1)}}{1 - \rho^2}.$

Consequently, as long as $\rho \neq 0$, $\det(\Lambda) \neq 0$ which of course implies that the matrix
$\Lambda$ is invertible. The identity (3.5) comes from the block matrix inversion formula
given e.g. by Horn and Johnson [13], page 18.
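
As a sanity check of (3.1) to (3.6), the matrices can be built numerically. The sketch below is illustrative only; the helper name `limiting_matrix` is hypothetical and NumPy is assumed to be available.

```python
import numpy as np


def limiting_matrix(theta, rho):
    """Return (Lambda, Lambda_inv, S) built from (3.1)-(3.6)."""
    theta = np.asarray(theta, dtype=float)
    p = theta.size
    powers = rho ** np.arange(1, p + 1)              # rho^1, ..., rho^p
    K = np.concatenate(([0.0], -(theta + powers)))   # line vector (3.2)
    S = rho ** (2 * (p + 1)) / (1.0 - rho ** 2)      # Schur complement (3.6)
    H = K @ K + S                                    # (3.1), as ||K||^2 = sum (theta_k + rho^k)^2
    L = np.eye(p + 1)
    Lam = np.block([[L, K[:, None]], [K[None, :], np.array([[H]])]])
    # Closed form (3.5); note that (1 - rho^2) / rho^(2(p+1)) = 1 / S.
    Lam_inv = np.block([[S * L + np.outer(K, K), -K[:, None]],
                        [-K[None, :], np.ones((1, 1))]]) / S
    return Lam, Lam_inv, S
```

With $\theta = 8/5$ and $\rho = -4/5$, the product of the two returned matrices is the identity up to rounding, and the determinant of $\Lambda$ equals $S = 256/225$.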

The almost sure properties of the least squares estimator $\hat\vartheta_n$ of $\vartheta$ are as
follows.

Theorem 3.1. Assume that the serial correlation parameter $\rho \neq 0$ and that $(V_n)$
has a finite conditional moment of order > 2. Then, $\hat\vartheta_n$ converges almost surely
to $\vartheta$,

(3.7)    $\|\hat\vartheta_n - \vartheta\|^2 = O\!\left(\frac{\log n}{n}\right)$ a.s.

Proof. The proof is given in Appendix A. □
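We shall now make explicit the estimators of $\theta$ and $\rho$ and their convergence
results. It follows from (2.4) that

(3.8)    $\begin{pmatrix} \theta \\ \rho \end{pmatrix} = \Delta\,\vartheta$

where $\Delta$ is the rectangular matrix of size $(p + 1) \times (p + 2)$ given by

(3.9)    $\Delta = \begin{pmatrix}
1 & 0 & \cdots & 0 & 0 & 1 \\
\rho & 1 & \ddots & \vdots & \vdots & \rho \\
\vdots & \ddots & \ddots & 0 & \vdots & \vdots \\
\rho^{p-1} & \cdots & \rho & 1 & 0 & \rho^{p-1} \\
0 & \cdots & \cdots & 0 & 0 & -1
\end{pmatrix}.$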
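Consequently, a natural choice to estimate the initial parameters $\theta$ and $\rho$ is to
make use of

(3.10)    $\begin{pmatrix} \hat\theta_n \\ \hat\rho_n \end{pmatrix} = \hat\Delta_n\,\hat\vartheta_n$

where $\hat\rho_n$ is simply the opposite of the last coordinate of $\hat\vartheta_n$ and

(3.11)    $\hat\Delta_n = \begin{pmatrix}
1 & 0 & \cdots & 0 & 0 & 1 \\
\hat\rho_n & 1 & \ddots & \vdots & \vdots & \hat\rho_n \\
\vdots & \ddots & \ddots & 0 & \vdots & \vdots \\
\hat\rho_n^{\,p-1} & \cdots & \hat\rho_n & 1 & 0 & \hat\rho_n^{\,p-1} \\
0 & \cdots & \cdots & 0 & 0 & -1
\end{pmatrix}.$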
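
The change of parameters (2.4) and its inverse (3.8) can be checked numerically. The sketch below is illustrative; the helper names `delta_matrix` and `vartheta_from` are hypothetical. The plug-in estimator (3.10) would be obtained in the same way, by calling `delta_matrix` with $\hat\rho_n$ (the opposite of the last coordinate of $\hat\vartheta_n$) in place of $\rho$.

```python
import numpy as np


def delta_matrix(rho, p):
    """Matrix (3.9) of size (p + 1) x (p + 2)."""
    D = np.zeros((p + 1, p + 2))
    for i in range(p):
        for j in range(i + 1):
            D[i, j] = rho ** (i - j)        # lower triangular block: rho^(i-j)
        D[i, p + 1] = rho ** i              # last column: 1, rho, ..., rho^(p-1)
    D[p, p + 1] = -1.0                      # last row returns -vartheta_{p+2} = rho
    return D


def vartheta_from(theta, rho):
    """Parameter (2.4) of the ARX(p + 1, 1) form."""
    theta = np.asarray(theta, dtype=float)
    shifted = np.concatenate(([-1.0], theta, [1.0]))      # (-1, theta, 1)
    return np.concatenate((theta, [0.0, 0.0])) - rho * shifted
```

For example, with $\theta = (1, 4/5)$ and $\rho = -9/10$, the product `delta_matrix(-0.9, 2) @ vartheta_from([1.0, 0.8], -0.9)` returns $(\theta_1, \theta_2, \rho)$ up to rounding.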



Corollary 3.1. Assume that the serial correlation parameter $\rho \neq 0$ and that $(V_n)$
has a finite conditional moment of order > 2. Then, $\hat\theta_n$ and $\hat\rho_n$ both converge
almost surely to $\theta$ and $\rho$,

(3.12)    $\|\hat\theta_n - \theta\|^2 = O\!\left(\frac{\log n}{n}\right)$ a.s.

(3.13)    $(\hat\rho_n - \rho)^2 = O\!\left(\frac{\log n}{n}\right)$ a.s.

Proof. One can immediately see from (3.8) that the last component of the vector
$\vartheta$ is $-\rho$. Likewise, the last component of $\hat\vartheta_n$ is $-\hat\rho_n$. Consequently,
we deduce from (3.7) that $\hat\rho_n$ converges a.s. to $\rho$ with the almost sure rate of
convergence given by (3.13). Therefore, we obtain from (3.9) and (3.11) that

$\|\hat\Delta_n - \Delta\|^2 = O\!\left(\frac{\log n}{n}\right)$ a.s.

which ensures via (3.7) and (3.10) that $\hat\theta_n$ converges a.s. to $\theta$ with the almost
sure rate of convergence given by (3.12). □



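4. Asymptotic Normality

This section is devoted to the asymptotic normality of the estimators associated
with $\theta$ and $\rho$, which is obtained from the asymptotic normality of the least squares
estimator $\hat\vartheta_n$ of $\vartheta$.

Theorem 4.1. Assume that the serial correlation parameter $\rho \neq 0$ and that $(V_n)$
has a finite conditional moment of order > 2. In addition, suppose that $(x_n)$ has the
same almost sure regularity as $(V_n)$. Then, we have

(4.1)    $\sqrt{n}\,(\hat\vartheta_n - \vartheta) \xrightarrow{\mathcal{L}} \mathcal{N}(0, \Lambda^{-1})$

where the matrix $\Lambda^{-1}$ is given by (3.5).

In order to provide the joint asymptotic normality of the estimators of $\theta$ and $\rho$,
denote, for all $1 \leq k \leq p - 1$,

$\xi_k = \sum_{i=1}^{k} \rho^{k-i}\theta_i$

and let $\nabla$ be the rectangular matrix of size $(p + 1) \times (p + 2)$ given by

(4.2)    $\nabla = \begin{pmatrix}
1 & 0 & \cdots & 0 & 0 & 1 \\
\rho & 1 & \ddots & \vdots & \vdots & \rho - \xi_1 \\
\vdots & \ddots & \ddots & 0 & \vdots & \vdots \\
\rho^{p-1} & \cdots & \rho & 1 & 0 & \rho^{p-1} - \xi_{p-1} \\
0 & \cdots & \cdots & 0 & 0 & -1
\end{pmatrix}.$

Corollary 4.1. Assume that the serial correlation parameter $\rho \neq 0$ and that $(V_n)$
has a finite conditional moment of order > 2. In addition, suppose that $(x_n)$ has the
same almost sure regularity as $(V_n)$. Then, we have

(4.3)    $\sqrt{n} \begin{pmatrix} \hat\theta_n - \theta \\ \hat\rho_n - \rho \end{pmatrix} \xrightarrow{\mathcal{L}} \mathcal{N}(0, \Sigma)$

where $\Sigma = \nabla \Lambda^{-1} \nabla^t$. In particular,

(4.4)    $\sqrt{n}\,(\hat\rho_n - \rho) \xrightarrow{\mathcal{L}} \mathcal{N}\!\left(0, \frac{1 - \rho^2}{\rho^{2(p+1)}}\right).$

Proof. The proof is given in Appendix B. □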



□ 



5. On the Durbin-Watson statistic

We now investigate the asymptotic behavior of the Durbin-Watson statistic [6],
[7], [8] given, for all $n \geq 1$, by

(5.1)    $\hat D_n = \frac{\sum_{k=1}^{n} (\hat\varepsilon_k - \hat\varepsilon_{k-1})^2}{\sum_{k=0}^{n} \hat\varepsilon_k^2}$

where the residuals $\hat\varepsilon_k$ are defined, for all $1 \leq k \leq n$, by

(5.2)    $\hat\varepsilon_k = X_k - U_{k-1} - \hat\theta_n^t \varphi_{k-1}$

with $\hat\theta_n$ given by (3.10). The initial value $\hat\varepsilon_0$ may be arbitrarily chosen and we
take $\hat\varepsilon_0 = X_0$. One can observe that it is also possible to estimate the serial
correlation parameter $\rho$ by the least squares estimator

(5.3)    $\bar\rho_n = \frac{\sum_{k=1}^{n} \hat\varepsilon_k \hat\varepsilon_{k-1}}{\sum_{k=1}^{n} \hat\varepsilon_{k-1}^2}$

which is the natural estimator of $\rho$ in the autoregressive framework without control.
The Durbin-Watson statistic is related to $\bar\rho_n$ by the linear relation

(5.4)    $\hat D_n = 2(1 - \bar\rho_n) + \zeta_n,$

where the remainder term $\zeta_n$ plays a negligible role. The almost sure properties of
$\hat D_n$ and $\bar\rho_n$ are as follows.

Theorem 5.1. Assume that the serial correlation parameter $\rho \neq 0$ and that $(V_n)$ has
a finite conditional moment of order > 2. Then, $\bar\rho_n$ converges almost surely to $\rho$,

(5.5)    $(\bar\rho_n - \rho)^2 = O\!\left(\frac{\log n}{n}\right)$ a.s.

In addition, $\hat D_n$ converges almost surely to $D = 2(1 - \rho)$. Moreover, if $(V_n)$ has a
finite conditional moment of order > 4, we also have

(5.6)    $(\hat D_n - D)^2 = O\!\left(\frac{\log n}{n}\right)$ a.s.



Our next result deals with the asymptotic normality of the Durbin-Watson
statistic. For that purpose, it is necessary to introduce some notation. Denote

/ 1 \ / 1 \ 

h 

and (3 



(5.7) 

In addition, let 
(5.8) 



a 



-dp 



P 



P 

V ° / 



$\gamma = \Lambda\alpha + (1 - \rho^2)\nabla^t \beta.$



Theorem 5.2. Assume that the serial correlation parameter $\rho \neq 0$ and that $(V_n)$
has a finite conditional moment of order > 2. In addition, suppose that $(x_n)$ has the
same almost sure regularity as $(V_n)$. Then, we have

(5.9)    $\sqrt{n}\,(\bar\rho_n - \rho) \xrightarrow{\mathcal{L}} \mathcal{N}(0, \tau^2)$

where the asymptotic variance $\tau^2 = (1 - \rho^2)^2 \gamma^t \Lambda^{-1} \gamma$. Moreover, if $(V_n)$ has a
finite conditional moment of order > 4, we also have

(5.10)    $\sqrt{n}\,(\hat D_n - D) \xrightarrow{\mathcal{L}} \mathcal{N}(0, 4\tau^2).$

Proof. The proofs are given in Appendix C. □



Remark 5.1. It follows from (3.5) together with tedious but straightforward
calculations that for all $p \geq 1$,

(5.11)    $\tau^2 = \frac{1 - \rho^2}{\rho^{2(p+1)}} \Big( \rho^{2(p+1)} \big( 4 - (4p + 3)\rho^{2p} + 4p\rho^{2(p+1)} - \rho^{2(2p+1)} \big) + \big( 1 - (p + 1)\rho^{2p} + (p - 1)\rho^{2(p+1)} \big)^2 \Big).$

For example, in the particular case $p = 1$, we obtain that

(5.12)    $\tau^2 = \frac{1 - \rho^2}{\rho^4} \big( 1 - 4\rho^2 + 8\rho^4 - 7\rho^6 + 4\rho^8 - \rho^{10} \big).$

Moreover, it is not hard to see by a convexity argument that we always have for all
$p \geq 1$,

$\tau^2 \leq \frac{1 - \rho^2}{\rho^{2(p+1)}}.$

In other words, in view of (4.4), the least squares estimator $\bar\rho_n$ performs better than
$\hat\rho_n$ for the estimation of $\rho$. It suggests that a statistical test procedure built on the
Durbin-Watson statistic should be really powerful.
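
The comparison between the two asymptotic variances can be explored numerically. The sketch below is illustrative and rests on an assumption: the expression coded in `tau2` is our reading of (5.11), which we only cross-check against the special case (5.12). All function names are hypothetical.

```python
def tau2(rho, p):
    """Asymptotic variance tau^2 as read from (5.11) (assumed transcription)."""
    r2p = rho ** (2 * p)
    r2p2 = rho ** (2 * (p + 1))
    first = r2p2 * (4 - (4 * p + 3) * r2p + 4 * p * r2p2 - rho ** (2 * (2 * p + 1)))
    second = (1 - (p + 1) * r2p + (p - 1) * r2p2) ** 2
    return (1 - rho ** 2) / r2p2 * (first + second)


def tau2_p1(rho):
    """Special case (5.12) for p = 1."""
    x = rho ** 2
    poly = 1 - 4 * x + 8 * x ** 2 - 7 * x ** 3 + 4 * x ** 4 - x ** 5
    return (1 - x) / x ** 2 * poly


def var_rho_hat(rho, p):
    """Asymptotic variance of hat(rho)_n given by (4.4)."""
    return (1 - rho ** 2) / rho ** (2 * (p + 1))
```

On a grid of values of $\rho$, `tau2(rho, 1)` agrees with `tau2_p1(rho)` and stays below `var_rho_hat(rho, 1)`, in line with the inequality stated in Remark 5.1.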

We are now in the position to propose our new bilateral statistical test built on
the Durbin-Watson statistic $\hat D_n$. First of all, we shall not investigate the case
$\rho = 0$ since our approach is only of interest for ARX processes where the driven
noise is given by a first-order autoregressive process. For a given value $\rho_0$ such
that $|\rho_0| < 1$ and $\rho_0 \neq 0$, we wish to test whether or not the serial correlation
parameter is equal to $\rho_0$. It means that we wish to test

$\mathcal{H}_0$ : "$\rho = \rho_0$" against $\mathcal{H}_1$ : "$\rho \neq \rho_0$".

According to Theorem 5.1, we have under the null hypothesis $\mathcal{H}_0$

$\lim_{n \to \infty} \hat D_n = D_0$ a.s.

where $D_0 = 2(1 - \rho_0)$. In addition, we clearly have from (5.10) that under $\mathcal{H}_0$

(5.13)    $\frac{n}{4\tau^2} (\hat D_n - D_0)^2 \xrightarrow{\mathcal{L}} \chi^2$

where $\chi^2$ stands for a Chi-square distribution with one degree of freedom. Via (5.11),
an efficient strategy to estimate the asymptotic variance $\tau^2$ is to make use of

(5.14)    $\hat\tau_n^2 = \frac{1 - \bar\rho_n^2}{\bar\rho_n^{2(p+1)}} \Big( \bar\rho_n^{2(p+1)} \big( 4 - (4p + 3)\bar\rho_n^{2p} + 4p\bar\rho_n^{2(p+1)} - \bar\rho_n^{2(2p+1)} \big) + \big( 1 - (p + 1)\bar\rho_n^{2p} + (p - 1)\bar\rho_n^{2(p+1)} \big)^2 \Big).$
Therefore, our new bilateral statistical test relies on the following result.

Theorem 5.3. Assume that the serial correlation parameter $\rho \neq 0$ and that $(V_n)$
has a finite conditional moment of order > 4. In addition, suppose that $(x_n)$ has the
same almost sure regularity as $(V_n)$. Then, under the null hypothesis $\mathcal{H}_0$ : "$\rho = \rho_0$",

(5.15)    $\frac{n}{4\hat\tau_n^2} (\hat D_n - D_0)^2 \xrightarrow{\mathcal{L}} \chi^2$

where $\chi^2$ stands for a Chi-square distribution with one degree of freedom. In addition,
under the alternative hypothesis $\mathcal{H}_1$ : "$\rho \neq \rho_0$",

(5.16)    $\lim_{n \to \infty} \frac{n}{4\hat\tau_n^2} (\hat D_n - D_0)^2 = +\infty$ a.s.

Proof. The proof is given in Appendix C. □

From a practical point of view, for a significance level $\alpha$ where $0 < \alpha < 1$, the
acceptance and rejection regions are given by $\mathcal{A} = [0, a_\alpha]$ and $\mathcal{R} = \,]a_\alpha, +\infty[$ where
$a_\alpha$ stands for the $(1 - \alpha)$-quantile of the Chi-square distribution with one degree of
freedom. The null hypothesis $\mathcal{H}_0$ will be accepted if

$\frac{n}{4\hat\tau_n^2} (\hat D_n - D_0)^2 \leq a_\alpha$

and will be rejected otherwise.
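
The decision rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the residuals `res` are assumed to have been computed beforehand via (5.2), and the variance estimate `tau2_hat` (for instance from (5.14)) is supplied by the caller. The Chi-square quantile with one degree of freedom is obtained as the square of a standard normal quantile.

```python
from statistics import NormalDist


def durbin_watson(res):
    """Statistic (5.1); res[k] is the residual hat(eps)_k for k = 0, ..., n."""
    num = sum((res[k] - res[k - 1]) ** 2 for k in range(1, len(res)))
    den = sum(e ** 2 for e in res)
    return num / den


def rho_bar(res):
    """Least squares estimator (5.3) built on the residuals."""
    num = sum(res[k] * res[k - 1] for k in range(1, len(res)))
    den = sum(e ** 2 for e in res[:-1])
    return num / den


def dw_test(res, rho0, tau2_hat, alpha=0.05):
    """Return (statistic, threshold, accept_H0) for H0: rho = rho0."""
    n = len(res) - 1
    d0 = 2.0 * (1.0 - rho0)
    stat = n / (4.0 * tau2_hat) * (durbin_watson(res) - d0) ** 2
    # (1 - alpha)-quantile of chi-square(1) = squared (1 - alpha/2)-quantile of N(0, 1)
    threshold = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2
    return stat, threshold, stat <= threshold
```

The null hypothesis is accepted exactly when the third returned value is true, which mirrors the acceptance region $\mathcal{A} = [0, a_\alpha]$.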



6. Numerical Experiments

The purpose of this section is to provide some numerical experiments in order
to illustrate our main theoretical results. In order to keep this section brief, we
shall only consider the ARX(p, 0) process $(X_n)$ given by (1.1) in the particular cases
$p = 1$ and $p = 2$, where the driven noise $(\varepsilon_n)$ satisfies (1.2). Moreover, for the sake
of simplicity, the reference trajectory $(x_n)$ is chosen to be identically zero and $(V_n)$ is
a Gaussian white noise with $\mathcal{N}(0, 1)$ distribution. Finally, our numerical simulations
are based on 500 realizations of sample size $N = 1000$. First of all, consider the
ARX(1, 0) process given, for all $n \geq 1$, by

(6.1)    $X_{n+1} = \theta X_n + U_n + \varepsilon_{n+1}$ and $\varepsilon_{n+1} = \rho\,\varepsilon_n + V_{n+1}$

where we have chosen $\theta = 8/5$ and $\rho = -4/5$ which implies that $D = 18/5$ and
the Schur complement $S = 16^2/15^2$. This choice has been made in order to obtain
simple expressions for the matrices $\Lambda$ and $\Sigma$. One can easily see from (3.2) to (3.5)
that

$\Lambda = \frac{1}{45} \begin{pmatrix} 45 & 0 & 0 \\ 0 & 45 & -36 \\ 0 & -36 & 80 \end{pmatrix}$

as well as

$\Sigma = \nabla \Lambda^{-1} \nabla^t = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + \frac{15^2}{16^2} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}.$

Figure 1 illustrates the almost sure convergence of $\hat\theta_n$, $\hat\rho_n$, $\bar\rho_n$ and $\hat D_n$. One can
see that the almost sure convergence is very satisfactory.


[Figure 1: two panels, "Almost sure convergence of the LS estimates" and "Almost sure
convergence of the DW statistics"; horizontal axis from 0 to 1000.]

Figure 1. Almost sure convergence in the particular case p = 1.



We shall now focus our attention on the asymptotic normality. We compare the
empirical distributions of the LS estimates

$\sqrt{\frac{nS}{1 + S}}\,(\hat\theta_n - \theta)$ and $\sqrt{nS}\,(\hat\rho_n - \rho)$

with the standard $\mathcal{N}(0, 1)$ distribution. We proceed in the same way for the
Durbin-Watson statistics

$\frac{\sqrt{n}}{\tau}\,(\bar\rho_n - \rho)$ and $\frac{\sqrt{n}}{2\tau}\,(\hat D_n - D)$

where $\tau$ is given by (5.12). We use the natural estimates of $S$ and $\tau$ by replacing
$\rho$ by $\hat\rho_n$ and $\bar\rho_n$ respectively. One can see in Figure 2 that the approximation by a
standard $\mathcal{N}(0, 1)$ distribution performs well. These results are very promising
in order to build a statistical test based on these statistics.

Figure 2. Asymptotic normality in the particular case p = 1.

Next, we are interested in the ARX(2, 0) process given, for all $n \geq 1$, by

(6.2)    $X_{n+1} = \theta_1 X_n + \theta_2 X_{n-1} + U_n + \varepsilon_{n+1}$ and $\varepsilon_{n+1} = \rho\,\varepsilon_n + V_{n+1}$

where we have chosen $\theta_1 = 1$, $\theta_2 = 4/5$ and $\rho = -9/10$ which leads to $D = 19/5$
and $S = 9^6/(19 \times 10^4)$. It follows from (3.2) to (3.5) that

$\Lambda = \frac{1}{9500} \begin{pmatrix} 9500 & 0 & 0 & 0 \\ 0 & 9500 & 0 & -950 \\ 0 & 0 & 9500 & -15295 \\ 0 & -950 & -15295 & 51292 \end{pmatrix}.$

In addition, the diagonal entries of the covariance matrix $\Sigma = \nabla \Lambda^{-1} \nabla^t$ are
respectively given by

$1 + \frac{1}{S} = \frac{721441}{531441}, \qquad 1 + \rho^2 + \frac{4\rho^2}{S} = \frac{1947541}{656100}, \qquad \frac{1}{S} = \frac{190000}{531441}.$
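
The numerical values above can be reproduced from (3.1) to (3.6) and (4.2). The short script below is a sketch, assuming NumPy is available; the matrix `nabla` is written out for $p = 2$ using $\xi_1 = \theta_1$, and the variable names are ours.

```python
import numpy as np

theta = np.array([1.0, 0.8])
rho = -0.9
p = theta.size

S = rho ** (2 * (p + 1)) / (1 - rho ** 2)                      # Schur complement (3.6)
K = np.concatenate(([0.0], -(theta + rho ** np.arange(1, p + 1))))  # line vector (3.2)
H = K @ K + S                                                  # (3.1)
Lam = np.block([[np.eye(p + 1), K[:, None]],
                [K[None, :], np.array([[H]])]])                # matrix (3.3)

# Matrix (4.2) for p = 2, with xi_1 = theta_1.
nabla = np.array([[1.0, 0.0, 0.0, 1.0],
                  [rho, 1.0, 0.0, rho - theta[0]],
                  [0.0, 0.0, 0.0, -1.0]])
Sigma = nabla @ np.linalg.inv(Lam) @ nabla.T                   # covariance in (4.3)
```

Multiplying `Lam` by 9500 recovers the integer matrix displayed above, and `np.diag(Sigma)` matches the three fractions up to rounding.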



Figure 3 shows the almost sure convergence of $\hat\theta_{n,1}$, $\hat\theta_{n,2}$, $\hat\rho_n$, $\bar\rho_n$ and $\hat D_n$ while
Figure 4 illustrates their asymptotic normality. As in the case $p = 1$, one can observe
that the approximation by a standard $\mathcal{N}(0, 1)$ distribution works well.



[Figure 3: two panels, "Almost sure convergence of the LS estimates" and "Almost sure
convergence of the DW statistics"; horizontal axis from 0 to 1000.]

Figure 3. Almost sure convergence in the particular case p = 2.



ASYMPTOTIC BEHAVIOR OF THE DURBIN- WATSON STATISTIC FOR ARX PROCESSES 13 



[Figure 4: two panels, "CLT for the LS estimates" and "CLT for the DW statistics".]

Figure 4. Asymptotic normality in the particular case p = 2.



We shall conclude this section by illustrating the behavior of the Durbin-Watson
statistical test. We wish to test $\mathcal{H}_0$ : "$\rho = \rho_0$" against $\mathcal{H}_1$ : "$\rho \neq \rho_0$" at the 5% level
of significance for the ARX processes given by (6.1) and (6.2). More precisely, we
compute the frequency with which $\mathcal{H}_0$ is rejected for different values of $\rho_0$,

$\mathbb{P}(\text{rejecting } \mathcal{H}_0 \mid \mathcal{H}_1 \text{ is true})$

via 500 realizations of different sample sizes N = 50, 100 and 1000. Tables 1 and 2
report the empirical power of the statistical test, which shows that the
Durbin-Watson statistic performs very well.



DW: values of $\rho_0$     N = 50          N = 100         N = 1000
       -0.9            0.20 (0.80)     0.51 (0.49)     1.00 (0.00)
       -0.8            0.02 (0.98)     0.03 (0.97)     0.05 (0.95)
       -0.7            0.12 (0.88)     0.25 (0.75)     0.99 (0.01)
       -0.6            0.38 (0.62)     0.66 (0.34)     1.00 (0.01)
       -0.4            0.79 (0.21)     0.97 (0.03)     1.00 (0.00)
       -0.2            0.95 (0.05)     0.99 (0.01)     1.00 (0.00)
        0.2            0.99 (0.01)     1.00 (0.00)     1.00 (0.00)
        0.4            0.99 (0.01)     1.00 (0.00)     1.00 (0.00)
        0.6            0.99 (0.01)     1.00 (0.00)     1.00 (0.00)
        0.7            0.99 (0.01)     1.00 (0.00)     1.00 (0.00)
        0.8            1.00 (0.00)     1.00 (0.00)     1.00 (0.00)
        0.9            1.00 (0.00)     1.00 (0.00)     1.00 (0.00)

Table 1. Durbin-Watson test in the particular case $p = 1$ and $\rho = -0.8$.



DW: values of $\rho_0$     N = 50          N = 100         N = 1000
       -0.9            0.06 (0.94)     0.05 (0.95)     0.05 (0.95)
       -0.8            0.17 (0.83)     0.38 (0.62)     1.00 (0.00)
       -0.7            0.52 (0.48)     0.82 (0.18)     1.00 (0.00)
       -0.6            0.76 (0.24)     0.95 (0.05)     1.00 (0.00)
       -0.4            0.92 (0.08)     0.99 (0.01)     1.00 (0.00)
       -0.2            0.96 (0.04)     0.99 (0.01)     1.00 (0.00)
        0.2            0.99 (0.01)     1.00 (0.00)     1.00 (0.00)
        0.4            0.99 (0.01)     1.00 (0.00)     1.00 (0.00)
        0.6            1.00 (0.00)     1.00 (0.00)     1.00 (0.00)
        0.7            1.00 (0.00)     1.00 (0.00)     1.00 (0.00)
        0.8            1.00 (0.00)     1.00 (0.00)     1.00 (0.00)
        0.9            1.00 (0.00)     1.00 (0.00)     1.00 (0.00)


Table 2. Durbin-Watson test in the particular case $p = 2$ and $\rho = -0.9$.

Appendix A

PROOFS OF THE ALMOST SURE CONVERGENCE RESULTS

Denote by $A$ and $B$ the polynomials given, for all $z \in \mathbb{C}$, by

(A.1)    $A(z) = 1 - \sum_{k=1}^{p+1} a_k z^k$ and $B(z) = 1 - \rho z$

where $a_1 = \theta_1 + \rho$, $a_{p+1} = -\rho\theta_p$ and, for $2 \leq k \leq p$,

$a_k = \theta_k - \rho\theta_{k-1}.$

The fundamental ARX(p + 1, 1) equation given by (2.2) may be rewritten as

(A.2)    $A(R)X_n = B(R)U_{n-1} + V_n$

where $R$ stands for the shift-back operator $RX_n = X_{n-1}$. On the one hand, $B(z) = 0$
if and only if $z = 1/\rho$ with $\rho \neq 0$. Consequently, as $|\rho| < 1$, $B$ is clearly causal and
for all $z \in \mathbb{C}$ such that $|\rho z| < 1$,

$B^{-1}(z) = \frac{1}{1 - \rho z} = \sum_{k=0}^{\infty} \rho^k z^k.$

On the other hand, let $P$ be the power series given, for all $z \in \mathbb{C}$ such that
$|\rho z| < 1$, by

(A.3)    $P(z) = B^{-1}(z)(A(z) - 1) = \sum_{k=1}^{\infty} p_k z^k.$

It is not hard to see from (A.3) that, for $1 \leq k \leq p$, $p_k = -(\theta_k + \rho^k)$ while, for all
$k \geq p + 1$, $p_k = -\rho^k$. Consequently, as soon as $\rho \neq 0$, we deduce from [3] that the
process given by (A.2) is strongly controllable. One can observe that in our
situation, the usual notion of controllability is the same as the concept of strong
controllability. To be more precise, the assumption that $\rho \neq 0$ implies that the
polynomials $A - 1$ and $B$, given by (A.1), are coprime. It is exactly the so-called
controllability condition. We refer the reader to [5] for more details on the links
between the notions of controllability and strong controllability. Finally, we clearly
obtain Lemma 3.1 and Theorem 3.1 from (2.3) together with Theorem 5 of [3]. □



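Appendix B

PROOFS OF THE ASYMPTOTIC NORMALITY RESULTS

Theorem 4.1 immediately follows from Theorem 8 of [3]. We shall now proceed
to the proof of Corollary 4.1. First of all, denote, for $0 \leq k \leq p - 1$,

$s_k(\vartheta) = \sum_{i=1}^{k+1} \rho^{k-i+1}\vartheta_i - \rho^{k+1}$

where $\rho = -\vartheta_{p+2}$ and $s_p(\vartheta) = \rho$. In addition, let

(B.1)    $g(\vartheta) = \Delta\vartheta = \big( s_0(\vartheta), s_1(\vartheta), \ldots, s_p(\vartheta) \big)^t.$

One can easily check that the gradient of the function $g$ is given by

(B.2)    $\nabla g(\vartheta) = \begin{pmatrix}
1 & 0 & \cdots & 0 & 0 & \xi_0(\vartheta) \\
\rho & 1 & \ddots & \vdots & \vdots & \xi_1(\vartheta) \\
\vdots & \ddots & \ddots & 0 & \vdots & \vdots \\
\rho^{p-1} & \cdots & \rho & 1 & 0 & \xi_{p-1}(\vartheta) \\
0 & \cdots & \cdots & 0 & 0 & \xi_p(\vartheta)
\end{pmatrix}$

where $\xi_0(\vartheta) = 1$, $\xi_p(\vartheta) = -1$ and, for all $1 \leq k \leq p - 1$,

$\xi_k(\vartheta) = (k + 1)\rho^k - \sum_{i=1}^{k} (k - i + 1)\rho^{k-i}\vartheta_i.$

Using the expression (2.4) of $\vartheta$, one finds that $\xi_k(\vartheta) = \rho^k - \xi_k$ with $\xi_k$ defined in
Section 4, so the gradient of $g$ coincides with the matrix $\nabla$ given by (4.2). On the one
hand, it follows from (3.8) and (B.1) that

(B.3)    $g(\vartheta) = \begin{pmatrix} \theta \\ \rho \end{pmatrix}.$

On the other hand, we already saw from (4.1) that

(B.4)    $\sqrt{n}\,(\hat\vartheta_n - \vartheta) \xrightarrow{\mathcal{L}} \mathcal{N}(0, \Lambda^{-1}).$

Consequently, we deduce from (B.3) and (B.4) together with the well-known delta
method that

$\sqrt{n}\,\big( g(\hat\vartheta_n) - g(\vartheta) \big) = \sqrt{n} \begin{pmatrix} \hat\theta_n - \theta \\ \hat\rho_n - \rho \end{pmatrix} \xrightarrow{\mathcal{L}} \mathcal{N}(0, \Sigma)$

where $\Sigma = \nabla \Lambda^{-1} \nabla^t$, which completes the proof of Corollary 4.1. □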



Appendix C 

PROOFS OF THE DURBIN-WATSON STATISTIC RESULTS 



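Proof of Theorem 5.1. We are now in position to investigate the asymptotic
behavior of the Durbin-Watson statistic. First of all, we start with the proof of
Theorem 5.1. Recall from (2.1) together with (5.2) that the residuals are given, for
all $1 \leq k \leq n$, by

(C.1)    $\hat\varepsilon_k = X_k - U_{k-1} - \hat\theta_n^t \varphi_{k-1} = \varepsilon_k - \tilde\theta_n^t \varphi_{k-1}$

where $\tilde\theta_n = \hat\theta_n - \theta$. For all $n \geq 1$, denote

$I_n = \sum_{k=1}^{n} \hat\varepsilon_k \hat\varepsilon_{k-1}$ and $J_n = \sum_{k=0}^{n} \hat\varepsilon_k^2.$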



16 BERNARD BERCU, BRUNO PORTIER, AND VICTOR VAZQUEZ 

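It is not hard to see that

(C.2)    $I_n = \hat\varepsilon_0 \hat\varepsilon_1 + P_n^I - \tilde\theta_n^t Q_n^I + \tilde\theta_n^t S_{n-1}^I \tilde\theta_n,$

(C.3)    $J_n = \hat\varepsilon_0^2 + P_n^J - 2\tilde\theta_n^t Q_n^J + \tilde\theta_n^t S_{n-1}^J \tilde\theta_n$

where

$P_n^I = \sum_{k=2}^{n} \varepsilon_k \varepsilon_{k-1}, \quad Q_n^I = \sum_{k=2}^{n} (\varphi_{k-2}\varepsilon_k + \varphi_{k-1}\varepsilon_{k-1}), \quad S_n^I = \sum_{k=1}^{n} \varphi_k \varphi_{k-1}^t$

and

$P_n^J = \sum_{k=1}^{n} \varepsilon_k^2, \quad Q_n^J = \sum_{k=1}^{n} \varphi_{k-1}\varepsilon_k, \quad S_n^J = \sum_{k=0}^{n} \varphi_k \varphi_k^t.$

We deduce from (1.2) that

(C.4)    $(1 - \rho^2)P_n^J = \rho^2(\varepsilon_0^2 - \varepsilon_n^2) + 2\rho N_n + L_n$

where

$N_n = \sum_{k=1}^{n} \varepsilon_{k-1}V_k$ and $L_n = \sum_{k=1}^{n} V_k^2.$

Moreover, we assume that $(V_n)$ has a finite conditional moment of order $a > 2$.
Then, it follows from Proposition 1.3.23 page 25 of [4] that

(C.5)    $\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} V_k^2 = \sigma^2$ a.s.

In addition, we also have from Corollary 1.3.21 page 23 of [4] that for all $2 \leq b < a$,

(C.6)    $\sum_{k=1}^{n} |V_k|^b = O(n)$ a.s.

and

(C.7)    $\sup_{1 \leq k \leq n} |V_k| = o(n^{1/b})$ a.s.

However, we clearly obtain from (1.2) that

(C.8)    $\sup_{1 \leq k \leq n} |\varepsilon_k| \leq \frac{1}{1 - |\rho|} \Big( |\varepsilon_0| + \sup_{1 \leq k \leq n} |V_k| \Big)$

and

(C.9)    $\sum_{k=1}^{n} |\varepsilon_k|^b \leq (1 - |\rho|)^{-b} \Big( |\varepsilon_0|^b + \sum_{k=1}^{n} |V_k|^b \Big)$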

which of course implies that

(C.10)    $\sup_{1 \leq k \leq n} |\varepsilon_k| = o(n^{1/b})$ a.s.

and

(C.11)    $\sum_{k=1}^{n} |\varepsilon_k|^b = O(n)$ a.s.

In the particular case $b = 2$, we find that

(C.12)    $\sup_{1 \leq k \leq n} \varepsilon_k^2 = o(n)$ and $\sum_{k=1}^{n} \varepsilon_k^2 = O(n)$ a.s.

Hereafter, $(N_n)$ is a locally square-integrable real martingale with predictable
quadratic variation given, for all $n \geq 1$, by

$\langle N \rangle_n = \sigma^2 \sum_{k=0}^{n-1} \varepsilon_k^2.$

Therefore, we deduce from (C.12) and the strong law of large numbers for
martingales given e.g. by Theorem 1.3.15 page 20 of [4] that

(C.13)    $\lim_{n \to \infty} \frac{N_n}{n} = 0$ a.s.

Hence, we obtain from (C.4) together with (C.5), (C.12) and (C.13) that

(C.14)    $\lim_{n \to \infty} \frac{P_n^J}{n} = \frac{\sigma^2}{1 - \rho^2}$ a.s.

Furthermore, convergence (3.4) immediately implies that

(C.15)    $\lim_{n \to \infty} \frac{1}{n} S_n^J = \sigma^2 I_p$ a.s.

We also obtain from the Cauchy-Schwarz inequality, (C.12) and (C.15), that

$\|Q_n^J\| = O(n)$ a.s.

Consequently, we find from the conjunction of (3.12), (C.3), (C.13) and (C.15) that

(C.16)    $\lim_{n \to \infty} \frac{J_n}{n} = \frac{\sigma^2}{1 - \rho^2}$ a.s.

By the same token, as

(C.17)    $P_n^I = \rho P_{n-1}^J + N_n + \rho\varepsilon_0^2 - \varepsilon_0\varepsilon_1,$

it follows from (C.13) and (C.14) that

(C.18)    $\lim_{n \to \infty} \frac{P_n^I}{n} = \frac{\sigma^2 \rho}{1 - \rho^2}$ a.s.

which leads via (C.2) to

(C.19)    $\lim_{n \to \infty} \frac{I_n}{n} = \frac{\sigma^2 \rho}{1 - \rho^2}$ a.s.



Therefore, we obtain from definition (5.3) together with (C.16) and (C.19) that

(C.20)    $\lim_{n \to \infty} \bar\rho_n = \lim_{n \to \infty} \frac{I_n}{J_{n-1}} = \rho$ a.s.

In order to establish the almost sure rate of convergence given by (5.5), it is necessary
to make some sharp calculations. We infer from (C.2), (C.3) and (C.17) that

(C.21)    $I_n - \rho J_{n-1} = N_n - Q_n + R_n$

where $Q_n = (Q_n^I - 2\rho Q_{n-1}^J)^t \tilde\theta_n$ and

$R_n = \hat\varepsilon_0 \hat\varepsilon_1 - \varepsilon_0 \varepsilon_1 + \rho\varepsilon_0^2 - \rho\hat\varepsilon_0^2 + \tilde\theta_n^t (S_{n-1}^I - \rho S_{n-2}^J) \tilde\theta_n.$

On the one hand, it follows from convergence (3.4) together with (3.12) and the
Cauchy-Schwarz inequality, that

$|Q_n| = O(\sqrt{n \log n})$ and $|R_n| = O(\log n)$ a.s.

On the other hand, as $\langle N \rangle_n = O(n)$ a.s., we deduce from Theorem 1.3.24 page
26 of [1] related to the almost sure rate of convergence in the strong law of large
numbers for martingales that $|N_n| = O(\sqrt{n \log n})$ a.s. Therefore, we can conclude
from (C.16) and (C.21) that

(C.22)    $(\bar\rho_n - \rho)^2 = O\!\left(\frac{\log n}{n}\right)$ a.s.



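The proof of the almost sure convergence of $\hat D_n$ to $D = 2(1 - \rho)$ immediately follows
from (C.20). As a matter of fact, it follows from (5.1) that

(C.23)    $J_n \hat D_n = 2(J_{n-1} - I_n) + \hat\varepsilon_n^2 - \hat\varepsilon_0^2.$

Dividing both sides of (C.23) by $J_n$, we obtain that

(C.24)    $\hat D_n = 2(1 - f_n)(1 - \bar\rho_n) + g_n$

where

$f_n = \frac{J_n - J_{n-1}}{J_n} = \frac{\hat\varepsilon_n^2}{J_n}$ and $g_n = \frac{\hat\varepsilon_n^2 - \hat\varepsilon_0^2}{J_n}.$

However, convergence (C.16) ensures that $f_n$ and $g_n$ both tend to zero a.s.
Consequently, (C.20) immediately implies that

(C.25)    $\lim_{n \to \infty} \hat D_n = 2(1 - \rho)$ a.s.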
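
The algebraic identities (C.23) and (C.24) hold for any sequence of residuals, which can be checked numerically. The sketch below is illustrative; the function name `dw_decomposition` is hypothetical, and $f_n$ and $g_n$ are taken as $\hat\varepsilon_n^2 / J_n$ and $(\hat\varepsilon_n^2 - \hat\varepsilon_0^2)/J_n$.

```python
import random


def dw_decomposition(res):
    """Return (D_n, 2(1 - f_n)(1 - rho_bar_n) + g_n) for residuals res[0..n]."""
    n = len(res) - 1
    I_n = sum(res[k] * res[k - 1] for k in range(1, n + 1))
    J_n = sum(e ** 2 for e in res)
    J_prev = J_n - res[n] ** 2                      # J_{n-1}
    D_n = sum((res[k] - res[k - 1]) ** 2 for k in range(1, n + 1)) / J_n
    rho_bar_n = I_n / J_prev                        # estimator (5.3)
    f_n = res[n] ** 2 / J_n
    g_n = (res[n] ** 2 - res[0] ** 2) / J_n
    return D_n, 2 * (1 - f_n) * (1 - rho_bar_n) + g_n
```

Both returned values coincide up to rounding, whatever the input sequence, as long as $J_{n-1} > 0$.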



The almost sure rate of convergence given by (5.6) requires some additional
assumption on $(V_n)$. Hereafter, assume that the noise $(V_n)$ has a finite conditional
moment of order > 4. We clearly obtain from (3.4), (3.12) together with (C.1) and
(C.10) with $b = 4$ that

(C.26)    $\sup_{1 \leq k \leq n} \hat\varepsilon_k^2 = o(\sqrt{n}) + o(\log n) = o(\sqrt{n})$ a.s.

which leads by (C.16) to

(C.27)    $f_n = o\!\left(\frac{1}{\sqrt{n}}\right)$ and $g_n = o\!\left(\frac{1}{\sqrt{n}}\right)$ a.s.

In addition, it follows from (C.24) that

(C.28)    $\hat D_n - D = -2(1 - f_n)(\bar\rho_n - \rho) + 2(\rho - 1)f_n + g_n$

where $D = 2(1 - \rho)$. Consequently, we obtain by (C.22) and (C.27) that

(C.29)    $(\hat D_n - D)^2 = O\big( (\bar\rho_n - \rho)^2 \big) + O\!\left(\frac{1}{n}\right) = O\!\left(\frac{\log n}{n}\right)$ a.s.

which completes the proof of Theorem 5.1. □



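Proof of Theorem 5.2. The proof of Theorem 5.2 is much more difficult to handle.
We already saw from (C.21) that

(C.30)    $J_{n-1}(\bar\rho_n - \rho) = N_n - Q_n + R_n$

where the remainder $R_n$ plays a negligible role. This is of course not the case for
$Q_n = (Q_n^I - 2\rho Q_{n-1}^J)^t \tilde\theta_n$. We know from (3.8) and (3.10) that

(C.31)    $\begin{pmatrix} \hat\theta_n - \theta \\ \hat\rho_n - \rho \end{pmatrix} = \hat\Delta_n \hat\vartheta_n - \Delta\vartheta.$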



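One can observe that in the particular case $p = 1$, the right-hand side of (C.31)
reduces to the vector $\Delta(\hat\vartheta_n - \vartheta)$ since

$\hat\Delta_n = \Delta = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 0 & -1 \end{pmatrix}.$

For all $1 \leq k \leq p - 1$, denote

$s_n(k) = \sum_{i=0}^{k} \hat\rho_n^{\,i} \rho^{k-i}.$

It is easily checked that $\hat\Delta_n - \Delta$ can be rewritten as $\hat\Delta_n - \Delta = (\hat\rho_n - \rho)A_n$ where
$A_n$ is the rectangular matrix of size $(p + 1) \times (p + 2)$ given by

$A_n = \begin{pmatrix}
0 & 0 & \cdots & 0 & 0 & 0 & 0 \\
1 & 0 & \cdots & 0 & 0 & 0 & 1 \\
s_n(1) & 1 & \cdots & 0 & 0 & 0 & s_n(1) \\
\vdots & \ddots & \ddots & \vdots & \vdots & \vdots & \vdots \\
s_n(p-2) & s_n(p-3) & \cdots & 1 & 0 & 0 & s_n(p-2) \\
0 & 0 & \cdots & 0 & 0 & 0 & 0
\end{pmatrix}.$

It was already proven that $\hat\rho_n$ converges almost surely to $\rho$ which implies that for
all $1 \leq k \leq p - 1$,

$\lim_{n \to \infty} s_n(k) = (k + 1)\rho^k$ a.s.

It immediately leads to the almost sure convergence of $A_n$ to the matrix $A$ given by

(C.32)    $A = \begin{pmatrix}
0 & 0 & \cdots & 0 & 0 & 0 & 0 \\
1 & 0 & \cdots & 0 & 0 & 0 & 1 \\
2\rho & 1 & \cdots & 0 & 0 & 0 & 2\rho \\
\vdots & \ddots & \ddots & \vdots & \vdots & \vdots & \vdots \\
(p-1)\rho^{p-2} & (p-2)\rho^{p-3} & \cdots & 1 & 0 & 0 & (p-1)\rho^{p-2} \\
0 & 0 & \cdots & 0 & 0 & 0 & 0
\end{pmatrix}.$

Denote by $e_{p+2}$ the last vector of the canonical basis of $\mathbb{R}^{p+2}$. We clearly have from
(3.10) that

$\hat\rho_n - \rho = -e_{p+2}^t (\hat\vartheta_n - \vartheta)$

which implies that

(C.33)    $\hat\Delta_n \hat\vartheta_n - \Delta\vartheta = B_n (\hat\vartheta_n - \vartheta)$

where $B_n = \hat\Delta_n - A_n \vartheta e_{p+2}^t$. By the same token, let $0_p$ be the null vector of $\mathbb{R}^p$ and
denote by $J_p$ the rectangular matrix of size $p \times (p + 1)$ given by

$J_p = \begin{pmatrix} I_p & 0_p \end{pmatrix}.$

We deduce from (C.31) and (C.33) that

(C.34)    $\tilde\theta_n = \hat\theta_n - \theta = J_p \begin{pmatrix} \hat\theta_n - \theta \\ \hat\rho_n - \rho \end{pmatrix} = J_p(\hat\Delta_n \hat\vartheta_n - \Delta\vartheta) = J_p B_n (\hat\vartheta_n - \vartheta).$

We also have from (2.5) that

(C.35)    $\hat\vartheta_n - \vartheta = S_{n-1}^{-1} M_n$

where

$M_n = \hat\vartheta_0 - \vartheta + \sum_{k=1}^{n} \Phi_{k-1} V_k.$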



Consequently, it follows from ( [a30| , ( [0341 ) and ( [a35| that 

Jn^liPn -P)=K,- C^Mn + Rn 

where C„ = S^^B^^J^Tn with r„ = — 2pQf^_^, which leads to the main decom- 
position 

(C.36) f ~ = -^AnZn + Bn 

\Pn-p 

where 



n 



Mr, 



•^n = n\ ^Xl^ and i3„ = ( °f ' 

.•^n-l'^n "^n-l I V'^n-l-"'", 



where $0_{p+2}$ stands for the null vector of $\mathbb{R}^{p+2}$. The random sequence $(Z_n)$ is a
locally square-integrable $(p + 3)$-dimensional martingale with predictable quadratic
variation given, for all $n \geq 1$, by

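$$\langle Z \rangle_n = \sigma^2 \sum_{k=0}^{n-1} \begin{pmatrix} \Phi_k \Phi_k' & \Phi_k \varepsilon_k \\ \varepsilon_k \Phi_k' & \varepsilon_k^2 \end{pmatrix}.$$

We already saw from (3.4) that

(C.37)  $\displaystyle \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n} \Phi_k \Phi_k' = \sigma^2 \Lambda$  a.s.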



In addition, it follows from (C.14) that

(C.38)  $\displaystyle \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n} \varepsilon_k^2 = \frac{\sigma^2}{1 - \rho^2}$  a.s.
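
The limit in (C.38) is the stationary variance of a first-order autoregressive noise and can be checked by simulation. The sketch below is only an illustration: it assumes the recursion $\varepsilon_k = \rho\,\varepsilon_{k-1} + V_k$ from (1.2) with Gaussian innovations and a zero initial value, neither of which is stated in this proof, and the function name `mean_square_ar1` is hypothetical.

```python
import random

def mean_square_ar1(rho, sigma, n, seed=0):
    """Simulate eps_k = rho * eps_{k-1} + V_k and return (1/n) * sum of eps_k**2."""
    rng = random.Random(seed)
    eps, total = 0.0, 0.0
    for _ in range(n):
        eps = rho * eps + rng.gauss(0.0, sigma)  # Gaussian V_k is an assumption
        total += eps * eps
    return total / n

rho, sigma = 0.5, 1.0
estimate = mean_square_ar1(rho, sigma, n=200_000)
target = sigma**2 / (1.0 - rho**2)  # right-hand side of (C.38)
print(estimate, target)
```

For a long enough trajectory the empirical mean of squares should be close to $\sigma^2/(1-\rho^2)$; the agreement is statistical, not exact.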



Furthermore, it is not hard to see that

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} X_k V_k = \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} X_k \varepsilon_k = \sigma^2 \quad \text{a.s.}$$



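Moreover, we obtain from (1.2) that for all $n \geq p$ and for all $1 \leq \ell \leq p$,

$$\varepsilon_n = \rho^{\ell} \varepsilon_{n-\ell} + \sum_{i=0}^{\ell-1} \rho^{i} V_{n-i}.$$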
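
The unrolled form of (1.2) used here can be verified on a simulated path. The following sketch assumes the recursion $\varepsilon_k = \rho\,\varepsilon_{k-1} + V_k$ and the reading $\varepsilon_n = \rho^{\ell}\varepsilon_{n-\ell} + \sum_{i=0}^{\ell-1}\rho^{i}V_{n-i}$; the initial value, the Gaussian innovations and the helper name `unrolled` are illustrative choices.

```python
import random

rng = random.Random(1)
rho, n, p = 0.7, 50, 4
V = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]

# Build eps_k = rho * eps_{k-1} + V_k with eps_0 = V_0 (illustrative initial value)
eps = [V[0]]
for k in range(1, n + 1):
    eps.append(rho * eps[-1] + V[k])

def unrolled(m, ell):
    """rho**ell * eps_{m-ell} plus the sum over i < ell of rho**i * V_{m-i}."""
    return rho**ell * eps[m - ell] + sum(rho**i * V[m - i] for i in range(ell))

# The identity holds exactly (up to rounding) for every lag 1 <= ell <= p
for ell in range(1, p + 1):
    assert abs(eps[n] - unrolled(n, ell)) < 1e-9
```

Multiplying this identity by $X_{k-\ell}$ and summing over $k$ is what produces the two sums in the next display.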



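Consequently,

$$\sum_{k=1}^{n} X_{k-\ell}\, \varepsilon_k = \sum_{k=1}^{n} X_{k-\ell} \Bigl( \rho^{\ell} \varepsilon_{k-\ell} + \sum_{i=0}^{\ell-1} \rho^{i} V_{k-i} \Bigr) = \rho^{\ell} \sum_{k=1}^{n} X_{k-\ell}\, \varepsilon_{k-\ell} + \sum_{i=0}^{\ell-1} \rho^{i} \sum_{k=1}^{n} X_{k-\ell} V_{k-i}$$

which implies that for all $1 \leq \ell \leq p$,

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} X_{k-\ell}\, \varepsilon_k = \sigma^2 \rho^{\ell} \quad \text{a.s.}$$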



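On the other hand, we infer from (1.1) that

$$\sum_{k=1}^{n} U_{k-1} \varepsilon_k = \sum_{k=1}^{n} X_k \varepsilon_k - \sum_{k=1}^{n} \varepsilon_k^2 - \sum_{i=1}^{p} \theta_i \sum_{k=1}^{n} X_{k-i}\, \varepsilon_k.$$

Hence, we find that

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} U_{k-1} \varepsilon_k = -\sigma^2 \Bigl( \frac{\rho^2}{1 - \rho^2} + \sum_{i=1}^{p} \theta_i \rho^{i} \Bigr) \quad \text{a.s.}$$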



Consequently, we obtain that

(C.39)  $\displaystyle \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \Phi_k \varepsilon_k = \sigma^2 \zeta$  a.s.

where the vector $\zeta$ is given by

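(C.40)  $\displaystyle \zeta = \Bigl( 1,\ \rho,\ \ldots,\ \rho^{p},\ -\frac{\rho^2}{1 - \rho^2} - \sum_{i=1}^{p} \theta_i \rho^{i} \Bigr)'.$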



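We deduce from (C.37), (C.38) and (C.39) that

(C.41)  $\displaystyle \lim_{n \to \infty} \frac{1}{n} \langle Z \rangle_n = Z$  a.s.

where $Z$ is the positive-semidefinite symmetric matrix given by

(C.42)  $Z = \sigma^4 \begin{pmatrix} \Lambda & \zeta \\ \zeta' & (1 - \rho^2)^{-1} \end{pmatrix}.$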



One can observe that $Z$ is not positive-definite as $\det(Z) = 0$. Nevertheless, it
is not hard to see that (Zn) satisfies the Lindeberg condition. Therefore, we can 
conclude from the central limit theorem for multidimensional martingales given e.g. 
by Corollary 2.1.10 of ^ that 



(C.43)  $\displaystyle \frac{1}{\sqrt{n}} Z_n \xrightarrow{\ \mathcal{L}\ } \mathcal{N}(0, Z).$



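Furthermore, we already saw from (C.32) that

$$\lim_{n \to \infty} A_n = A \quad \text{a.s.}$$

which implies that

$$\lim_{n \to \infty} B_n = \Delta - A \vartheta e_{p+2}' \quad \text{a.s.}$$

One can easily check from (3.9) and (C.32) that

$$\Delta - A \vartheta e_{p+2}' = \nabla$$

where the matrix $\nabla$ is given by (4.2). Moreover, it follows from the previous calculation that

$$\lim_{n \to \infty} \frac{1}{n} T_n = \sigma^2 (1 - \rho^2)\, T \quad \text{a.s.}$$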

where the vector T is given by 



(C.44) 



/ 1 \ 

P 
Consequently, as $C_n = S_{n-1}^{-1} B_n' J_p' T_n$, we obtain from (3.4) that

$$\lim_{n \to \infty} C_n = C \quad \text{a.s.}$$

where

$$C = (1 - \rho^2)\, \Lambda^{-1} \nabla' J_p' T.$$



Hence, we obtain from (3.4) and (C.16) that

(C.45)  $\displaystyle \lim_{n \to \infty} \mathcal{A}_n = \mathcal{A}$  a.s.

where



Op+2 

(l-p2)C* (l-p2) 



In addition, we clearly have from (C.16) that
(C.46) 



lim Br, 



Op+2 





a.s. 



Finally, we deduce from the conjunction of (C.36), (C.43), (C.45), (C.46), together 
with Slutsky's lemma that 



Af(o,AZA' 
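which leads to

$$\sqrt{n}\,(\widehat{\rho}_n - \rho) \xrightarrow{\ \mathcal{L}\ } \mathcal{N}(0, \tau^2)$$

where the asymptotic variance $\tau^2$ is given by

$$\tau^2 = (1 - \rho^2)^2 \Bigl( C' \Lambda C - 2 \zeta' C + \frac{1}{1 - \rho^2} \Bigr).$$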

However, one can easily see from (5.7) and (5.8) that 

= (1 - p2)2 II aV2^ + (1_ ^2)^-1/2^*^ ||2^ 

= (l-p^ II A"'/'7f, 

= (i-p^)VA-S, 



which completes the proof of (5.9). Finally, (5.10) immediately follows from (5.9)
together with (C.27) and (C.28), which achieves the proof of Theorem 5.2.

□



Proof of Theorem 5.3. The proof of Theorem 5.3 is straightforward. As a matter
of fact, we already know from (5.10) that under the null hypothesis $\mathcal{H}_0$,



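(C.47)  $\displaystyle \sqrt{n}\,(\widehat{D}_n - D_0) \xrightarrow{\ \mathcal{L}\ } \mathcal{N}(0, 4\tau^2)$

where the asymptotic variance $\tau^2$ is given by (5.11). In addition, it follows from
(5.14) that

(C.48)  $\displaystyle \lim_{n \to \infty} \widehat{\tau}_n^{\,2} = \tau^2$  a.s.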
Hence, we deduce from (C.47), (C.48) and Slutsky's lemma that under the null
hypothesis $\mathcal{H}_0$,

$$\frac{\sqrt{n}}{2 \widehat{\tau}_n} (\widehat{D}_n - D_0) \xrightarrow{\ \mathcal{L}\ } \mathcal{N}(0, 1)$$



which obviously implies (5.15). It remains to show that under the alternative
hypothesis $\mathcal{H}_1$, our test statistic goes almost surely to infinity. Under $\mathcal{H}_1$, we already
saw from Theorem 5.1 that

$$\lim_{n \to \infty} \widehat{\rho}_n - \rho_0 = \rho - \rho_0 \quad \text{a.s.}$$

and this limit is different from zero. Consequently,

(C.49)  $\displaystyle \lim_{n \to \infty} n (\widehat{\rho}_n - \rho_0)^2 = +\infty$  a.s.



However, we clearly find from (C.28) that

(C.50)  $\widehat{D}_n - D_0 = -2(\widehat{\rho}_n - \rho_0) + e_n$

where $e_n = -2 f_n (1 - \widehat{\rho}_n) + g_n$. Finally, (C.49) and (C.50) clearly lead to (5.16),
completing the proof of Theorem 5.3.

□
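
The decision rule behind this proof can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: it takes $D_0 = 2(1 - \rho_0)$, which is the value consistent with (C.50), it uses the classical Durbin-Watson ratio on a generic list of residuals, and it expects the variance estimate `tau_hat` to be supplied by the user since its formula is not reproduced in this section. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def durbin_watson(residuals):
    """Classical Durbin-Watson ratio computed from a list of residuals."""
    num = sum((residuals[k] - residuals[k - 1]) ** 2
              for k in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

def bilateral_test(d_n, rho_0, tau_hat, n, alpha=0.05):
    """Return (statistic, reject) for H0: rho = rho_0 against H1: rho != rho_0."""
    d_0 = 2.0 * (1.0 - rho_0)                    # assumed value of D_0
    stat = sqrt(n) * (d_n - d_0) / (2.0 * tau_hat)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided Gaussian quantile
    return stat, abs(stat) > z

# Illustrative call with made-up inputs
stat, reject = bilateral_test(d_n=1.9, rho_0=0.0, tau_hat=1.0, n=400)
print(stat, reject)
```

Comparing the absolute value of the normalized statistic with a Gaussian quantile is equivalent to comparing its square with a chi-square quantile with one degree of freedom. Under the alternative the statistic diverges, so the rule rejects for every large enough sample.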


